Stacking for machine learning redshifts applied to SDSS galaxies

Authors

  • Roman Zitlau
  • Ben Hoyle
  • Kerstin Paech
  • Jochen Weller
  • Markus Michael Rau
  • Stella Seitz
Abstract

We present an analysis of a general machine learning technique called ‘stacking’ for the estimation of photometric redshifts. Stacking techniques can feed the photometric redshift estimate, as output by a base algorithm, back into the same algorithm as an additional input feature in a subsequent learning round. We show that all tested base algorithms benefit from at least one additional stacking round (or layer). To demonstrate the benefit of stacking, we apply the method to both unsupervised machine learning techniques based on self-organising maps (SOMs) and supervised machine learning methods based on decision trees. We explore a range of stacking architectures, varying the number of layers and the number of base learners per layer. Finally, we explore the effectiveness of stacking even when using a successful algorithm such as AdaBoost. We observe a significant improvement of between 1.9% and 21% on all computed metrics when stacking is applied to weak learners (such as SOMs and decision trees). When applied to strong learning algorithms (such as AdaBoost), the improvement shrinks but remains positive, at between 0.4% and 2.5% for the explored metrics, and comes at almost no additional computational cost.
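The feedback loop described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the base learner is a toy k-nearest-neighbour regressor standing in for a SOM or decision tree, the data are synthetic, and the use of out-of-fold predictions to build the extra feature is an assumption made here to avoid the model seeing its own training targets. All function names (`fit_knn`, `out_of_fold`, `fit_stacked`, `predict_stacked`) are hypothetical.

```python
import random

def fit_knn(X, y, k=5):
    """Toy base learner (assumption): k-nearest-neighbour regression."""
    def predict(row):
        # Squared Euclidean distance to every training row, paired with its target.
        dist = sorted(
            (sum((a - b) ** 2 for a, b in zip(row, other)), target)
            for other, target in zip(X, y)
        )
        nearest = dist[:k]
        return sum(t for _, t in nearest) / len(nearest)
    return predict

def out_of_fold(X, y, fit, n_folds=5):
    """Predict each training row with a model that never saw that row."""
    preds = [0.0] * len(X)
    for f in range(n_folds):
        train_idx = [i for i in range(len(X)) if i % n_folds != f]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in range(f, len(X), n_folds):
            preds[i] = model(X[i])
    return preds

def fit_stacked(X, y, fit, n_layers=2):
    """Train one model per layer; each later layer receives the previous
    layer's redshift estimate as an additional input feature."""
    models = []
    X_cur = [list(row) for row in X]
    for layer in range(n_layers):
        models.append(fit(X_cur, y))
        if layer < n_layers - 1:
            estimate = out_of_fold(X_cur, y, fit)
            X_cur = [row + [p] for row, p in zip(X_cur, estimate)]
    return models

def predict_stacked(models, row):
    """Pass one object through all layers, appending each estimate as a feature."""
    cur = list(row)
    pred = None
    for model in models:
        pred = model(cur)
        cur = cur + [pred]
    return pred

# Synthetic stand-in for photometric features and spectroscopic redshifts.
random.seed(0)
X = [[random.random(), random.random()] for _ in range(60)]
y = [0.3 * a + 0.1 * b * b for a, b in X]

models = fit_stacked(X, y, fit_knn, n_layers=2)
z_phot = predict_stacked(models, [0.5, 0.5])
```

Setting `n_layers=1` recovers the plain base learner, so the same harness can be used to compare an unstacked model with one or more stacking rounds on a held-out sample.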


Similar resources

Feature importance for machine learning redshifts applied to SDSS galaxies

We present an analysis of importance feature selection applied to photometric redshift estimation using the machine learning architecture Decision Trees with the ensemble learning routine Adaboost (hereafter RDF). We select a list of 85 easily measured (or derived) photometric quantities (or ‘features’) and spectroscopic redshifts for almost two million galaxies from the Sloan Digital Sky Surve...


A Census of Object Types and Redshift Estimates in the SDSS Photometric Catalog from a Trained Decision-Tree Classifier

We have applied ClassX, an oblique decision tree classifier optimized for astronomical analysis, to the homogeneous multicolor imaging data base of the Sloan Digital Sky Survey (SDSS), training the software on subsets of SDSS objects whose nature is precisely known via spectroscopy. We find that the software, using photometric data only, correctly classifies a very large fraction of the objects...


Evolution of Galaxy Luminosity Function and Luminosity Function by Density

Using a galaxy sample observed by the BATC large-field multi-color sky survey and galaxy data from SDSS in the overlapping fields, we study the dependence of the rest-frame r-band galaxy luminosity function on redshift and on large-scale environment. The large-scale environment is defined by isodensity contours with density contrast δρ/ρ. The data set is a composite sample of 69,671 galaxies with reds...


METAPHOR: Probability density estimation for machine learning based photometric redshifts

We present METAPHOR (Machine-learning Estimation Tool for Accurate PHOtometric Redshifts), a method able to provide a reliable PDF for photometric galaxy redshifts estimated through empirical techniques. METAPHOR is a modular workflow, mainly based on the MLPQNA neural network as internal engine to derive photometric galaxy redshifts, but giving the possibility to easily replace MLPQNA with any...


Comments on the Redshift Distribution of 44,200 SDSS Quasars: Evidence for Predicted Preferred Redshifts?

A Sloan Digital Sky Survey (SDSS) source sample containing 44,200 quasar redshifts is examined. Although arguments have been put forth to explain some of the structure observed in the redshift distribution, it is argued here that this structure may just as easily be explained by the presence of previously predicted preferred redshifts. Subject headings: galaxies: active galaxies: distances and ...



Journal:
  • CoRR

Volume: abs/1602.06294

Publication date: 2016